FINAL PROGRESS REPORT

INTERACTING  SITES  IN  NOVEL  POLYMERIC  PROTEINS 


1. FOREWORD

The primary sequence of a protein leads to its higher-order (secondary, tertiary and
quaternary) structure; sites of intraprotein and protein-protein interactions are contained therein.
Higher-order structures and properties of a native protein are acquired through protein folding
pathways that are best understood for small globular proteins [1-4]. Fibrous proteins comprise
essential building blocks for extracellular structures [5]. The chemical and physical properties of
several (such as collagen, keratin and silk) are known and provide the basis for novel protein-based
biomolecular materials [6]. However, in contrast to globular proteins, little is known about the
structure and folding pathways of fibrous proteins.

Aquatic larvae of the midge, Chironomus tentans, spin silk. In contrast to spider and
silkworm silks, midge silk contains 1% cysteine (Cys) [7]. Six midge silk proteins are composed of
50-130 copies of tandemly repeated sequences with two or four invariant Cys residues [8]. The
evolutionary conservation of these residues implies functional significance. This notion is supported
by the fact that Cys residues in a bacterially expressed core repeat from silk protein spIa can form
intramolecular disulfide bonds in vitro [9] and native silk proteins form intermolecular disulfide bonds
in vivo [10]. These residues are therefore sites of intraprotein and protein-protein interactions and
provide a unique opportunity to study the folding pathway and associative properties of fibrous
proteins.

Knowledge  gained  about  the  structure  and  interactive  sites  of  aquatic  midge  silk  proteins 
will  further  impact  the  potential  biotechnological  application  of  these  proteins  as  novel  biomolecular 
materials. 
4. STATEMENT OF PROBLEM STUDIED

The objective of this project was to study the structure, folding and association of midge
silk proteins. We initially focused on Cys residues that provide sites of intraprotein and
protein-protein interactions in rCAS, a recombinant protein modeled after core repeats in midge silk
protein spIa [9]. We also examined which silk proteins associate with spIs. Experiments outlined in
this proposal were designed to answer the following questions.

A. Which Cys residues are critical to the pathway for intramolecular disulfide bond
formation in rCAS? [Answer: all can participate; order of priority unknown.]

B. Are intramolecular disulfide bonds required to stabilize the higher-order structure of
rCAS? [Answer: not testable, since we were unable to ascertain the higher-order structure.]

C.  What  conditions  promote,  and  which  Cys  participate  in,  formation  of  intermolecular 
disulfide  bonds?  [Answer:  some  pairwise  associations  preferred.] 

D. How and where do other silk proteins interact with spIs? [Answer: one or a few molecules
of sp140 and sp185 form intermolecular disulfide bonds with spIs; the limited number
implies the sites are Cys within terminal domains rather than internal arrays of repeats.]

Unanticipated problems were encountered with rCAS, namely conflicting structural predictions and
the inability to refold rCAS into a single isomer. However, two new aspects of this project emerged
from studying additional silk proteins along the way. They enabled us to answer the following
questions.

E. What evidence is there that glycosylation of aquatic silk proteins is important? [Answer:
the number and location of N-linked glycosylation motifs in ssp160 are conserved between two
species in spite of genetic polymorphism.]

F. Do sp185/sp220 homologs and sp195 contain repeats with conserved Cys? [Answer:
sp185/sp220 are non-repetitive but contain 72 blocks of 20-28 residues with a novel
Cys-containing motif, whereas sp195 is composed of tandem arrays of 25-residue repeats
that include yet another novel Cys-containing motif.]

5.  SUMMARY  OF  MOST  IMPORTANT  RESULTS 

A.  Recombinant  Silk  Proteins 

We  successfully  produced  and  purified  recombinant  rCAS  derivatives  with  all  possible 
combinations  of  zero  to  four  Cys  to  Ala  substitutions  (acquired  by  in  vitro  mutagenesis  of  the 
synthetic  rCAS-encoding  gene). 
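
The size of this derivative set follows from simple combinatorics: four Cys positions, each either kept or replaced by Ala, give 2^4 = 16 proteins including unmodified rCAS. The sketch below only enumerates them. The site labels are hypothetical placeholders, since residue numbers are not given here.

```python
from itertools import combinations

# Hypothetical labels for the four Cys positions in rCAS; the report does not
# give residue numbers, so these names are placeholders only.
CYS_SITES = ("Cys-a", "Cys-b", "Cys-c", "Cys-d")


def cys_to_ala_variants(sites=CYS_SITES):
    """Return every set of sites replaced by Ala, from none to all of them."""
    variants = []
    for k in range(len(sites) + 1):
        for replaced in combinations(sites, k):
            variants.append(replaced)
    return variants


if __name__ == "__main__":
    variants = cys_to_ala_variants()
    print(len(variants), "derivatives in total")
    for replaced in variants:
        cys_kept = len(CYS_SITES) - len(replaced)
        print(f"Ala at {list(replaced) or 'no site'}; Cys kept: {cys_kept}")
```

Six of the sixteen derivatives retain exactly two Cys, which is the subset relevant to the refolding result described below.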

CD and FTIR spectra were obtained. All proteins had significant secondary structure;
however, the two methods gave markedly contradictory estimates of the percentages of
helix, turn and sheet. Since the basis spectra for both methods were derived from globular
proteins, we concluded that neither is suitable until characterized fibrous protein standards exist.

Solvent and thermal denaturation studies brought more conflicts; rather than cooperative
denaturation curves, linear curves were obtained, suggesting all proteins were random coils. Identical
results were obtained using proteins with and without Cys, negating the use of this method to study
how intramolecular disulfide bonds stabilize secondary structure.
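
For readers unfamiliar with the term, a cooperative denaturation curve is the sigmoid expected from a two-state (folded/unfolded) equilibrium. The sketch below uses the textbook linear-extrapolation model, dG([D]) = dG_water - m[D], purely to illustrate that shape. It is not the analysis used in this project, and the parameter values (dg_water, m_value) are arbitrary assumptions chosen for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(denaturant_molar, dg_water=5.0, m_value=2.0, temp_k=298.15):
    """Fraction unfolded for a two-state protein at a given denaturant level.

    dg_water (kcal/mol) and m_value (kcal/mol/M) are illustrative values,
    not measurements from this report.
    """
    dg = dg_water - m_value * denaturant_molar        # unfolding free energy
    k_unfold = math.exp(-dg / (R_KCAL * temp_k))      # equilibrium constant
    return k_unfold / (1.0 + k_unfold)


if __name__ == "__main__":
    for conc in (0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 5.0):
        print(f"{conc:.1f} M denaturant: fraction unfolded = {fraction_unfolded(conc):.3f}")
```

In this model the signal changes steeply around the midpoint (dg_water / m_value) and is flat elsewhere. A curve that is linear across the whole range shows no such transition.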

One positive result was obtained: upon reduction and re-oxidation, all two-Cys-containing
derivatives quantitatively reformed one intramolecular disulfide bond. This indicates either
unparalleled flexibility or artifactual refolding of rCAS derivatives. At higher concentrations, some
showed a preference for forming intermolecular disulfide bonds (Fig. 1). When three or more Cys
were present, HPLC indicated that refolding resulted in multiple conformations. This precluded
higher-order structural studies such as NMR and crystallography.
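
One way to see why three or more Cys can yield several refolded species is to count the possible intramolecular disulfide pairings. The sketch below enumerates them for a single chain. It is a combinatorial illustration only: it ignores intermolecular bonds and says nothing about which isomers actually formed. The site labels are hypothetical.

```python
def disulfide_pairings(sites, n_bonds):
    """List every way to form n_bonds disulfides, using each Cys at most once."""
    sites = tuple(sites)
    if n_bonds == 0:
        return [()]
    pairings = []
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            # Later bonds may only use residues after position i, so each set
            # of bonds is listed exactly once.
            rest = sites[i + 1:j] + sites[j + 1:]
            for tail in disulfide_pairings(rest, n_bonds - 1):
                pairings.append(((sites[i], sites[j]),) + tail)
    return pairings


if __name__ == "__main__":
    labels = ("Cys-a", "Cys-b", "Cys-c", "Cys-d")  # hypothetical site labels
    for n_cys in (2, 3, 4):
        for n_bonds in range(1, n_cys // 2 + 1):
            count = len(disulfide_pairings(labels[:n_cys], n_bonds))
            print(f"{n_cys} Cys, {n_bonds} bond(s): {count} possible pairing(s)")
```

With two Cys there is a single possible intramolecular bond. With three Cys there are three one-bond isomers, and with four Cys there are six one-bond and three two-bond isomers.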

B.  Native  Silk  Protein  Complexes 

Progress was made studying native silk protein complexes (listed below in section 6 as
publication P3). We determined the approximate mass (2400 kDa) and provided the first qualitative
description of the composition of a fibrous protein quaternary structure. This complex contains spI,
sp185 and sp140 covalently linked by disulfide bonds and possibly sp40, sp17 and sp12 linked
non-covalently (Fig. 2). The 130 nm diameter lattice-like complexes form oligomers that can associate
into multi-stranded beaded fibers in a concentration-dependent manner. Disulfide bond reduction
dissociates sp185 and sp140 from spIs, supporting the notion that Cys conserved in these proteins
are also the sites of protein-protein interactions.

Analytical ultracentrifugation studies were not feasible due to the insolubility of silk protein
complexes in the UV-transparent buffers required to profile the proteins at low wavelengths (aquatic
silk proteins contain relatively few aromatic amino acids).

C.  Conservation  of  Glycosylation  Motifs  in  a  Silk  Protein 

We  suspect  the  “beads”  on  silk  fibers  assembled  from  high  molecular  mass  complexes  in 
vitro  are  carbohydrate  chains.  Glycosylation  of  aquatic  silk  proteins  has  been  reported  but  largely 
ignored.  We  therefore  pursued  one  silk  protein  whose  glycosylation  may  be  extreme. 

ssp160 is the special lobe-specific 160-kDa silk protein whose synthesis is limited to four
cells that surround the salivary duct where fibers form in vivo. Lectin binding and affinity
chromatography indicate this silk protein contains both N- and O-linked sugars. ssp160 cDNA
was first isolated from Chironomus thummi [11], a species for which little molecular biological data
exist for silk proteins; this protein has never been observed in C. tentans, the species with the
largest database. We chose to bridge this gap and, in doing so, answered two classical questions
that were posed in this biological system over 30 years ago.

C. thummi ssp160 cDNA was used to isolate the corresponding gene (publication P2) and the
homologous cDNA and gene from Chironomus pallidivittatus (publications P5 and P6), the
closest relative to C. tentans. Their comparison revealed evolutionary conservation of 12-13
copies of an N-linked glycosylation motif sequestered in two regions of ssp160 where codon
deletions are frequent (publication P5). These deletions appear to result from slipped-strand
mispairing among arrays of [ACA] repeats. The C. pallidivittatus ssp160 gene mapped to Balbiani
ring 4 (BR4) on polytene chromosome IV, the last BR without an identifiable gene. This result
fulfilled Beermann’s prediction that all BRs contain a gene encoding a major tissue-specific (silk)
protein.
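
Tandem arrays of a short unit, the substrate for slipped-strand mispairing, can be located in a sequence with a simple pattern search. The sketch below is illustrative only: the function name, the minimum copy number and the example sequence are assumptions, and the sequence is synthetic rather than taken from ssp160.

```python
import re


def find_tandem_arrays(dna, unit="ACA", min_copies=3):
    """Return (start, copies) for each run of at least min_copies tandem units."""
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(unit), min_copies))
    return [
        (match.start(), len(match.group()) // len(unit))
        for match in pattern.finditer(dna.upper())
    ]


if __name__ == "__main__":
    # Synthetic example sequence, not real ssp160 sequence.
    example = "GGT" + "ACA" * 4 + "TTG" + "ACA" * 2 + "GG"
    for start, copies in find_tandem_arrays(example):
        print(f"array of {copies} ACA units starting at position {start}")
```

Because the unit is three nucleotides long, gaining or losing whole units changes the codon count without shifting the reading frame, which is consistent with the codon deletions described above. The same function can be called with a 5-bp unit to scan non-coding regions.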

Comparison of an ssp160 gene from C. thummi (publication P2) and two alleles from
C. pallidivittatus (publication P6) revealed differences in intron sizes and downstream flanks.
All are attributable to slipped-strand mispairing among 5-bp repeats contained therein
(publication P6). Similar dynamics on a larger scale may have resulted in gene deletion: probes
for the C. pallidivittatus gene and flanks were used to demonstrate that the apparent absence of
BR4 on C. tentans polytene chromosomes is due to deletion of the ssp160 gene
(publication P7). The upstream deletion breakpoint has been mapped with nucleotide
precision; the downstream breakpoint is less clear due to repeats.

The  above  study  increased  our  need  for  an  efficient  DNA  sequencing  strategy.  Thus  we 
invested  time  in  combining  a  new  plasmid  vector  for  transposon-facilitated  DNA  sequencing  with 
long  and  accurate  PCR  mapping  of  transposon  insertion  sites  (publication  P4). 

D. Other Novel Cys-Containing Motifs

We completed the full-length cDNA sequences of C. pallidivittatus sp185 and C.
thummi sp220 (publication P1) and compared them to C. tentans sp185. These silk proteins
contain 72 blocks of 20-28 residues, 61% of which contain the novel Cys-containing motif:
(X5-8)-Cys-(X5)-(Trp/Phe/Tyr)-(X4)-Cys-X-Cys-X-Cys.
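
A motif of this kind can be screened for with a regular expression. The sketch below assumes the motif reads X(5-8)-Cys-X(5)-(Trp/Phe/Tyr)-X(4)-Cys-X-Cys-X-Cys, where X is any residue. The spacer lengths are a best reading of the printed motif and should be checked against publication P1. The example blocks are synthetic, not sp185 or sp220 sequence.

```python
import re

# Assumed reading of the motif; spacer lengths should be verified against P1.
MOTIF = re.compile(r"[A-Z]{5,8}C[A-Z]{5}[WFY][A-Z]{4}C[A-Z]C[A-Z]C")


def blocks_with_motif(blocks):
    """Return the blocks (one-letter amino acid strings) that contain the motif."""
    return [block for block in blocks if MOTIF.search(block.upper())]


if __name__ == "__main__":
    # Synthetic 21-residue blocks for illustration only.
    blocks = [
        "AAAAACGGGGGWSSSSCKCEC",  # aromatic residue present at the expected spot
        "AAAAACGGGGGASSSSCKCEC",  # aromatic residue replaced by Ala
    ]
    hits = blocks_with_motif(blocks)
    print(f"{len(hits)} of {len(blocks)} blocks contain the motif")
```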

Only 120 nucleotides of cDNA have ever been reported [12] for sp195, but they encode two
Cys residues. We therefore decided to attempt to complete the 6-kb cDNA sequence encoding this
protein. Tandem arrays of 75-bp protein-coding repeats within this cDNA render it unstable when
propagated in bacterial plasmids. Nonetheless, we have acquired over 4 kb of cDNA sequence and
found blocks of nearly perfect 25-residue repeats that contain yet another novel Cys-containing
motif: DTPANKKWNENrrC(C/S)LECKT(E/V)KPKP(D/0) (publication P8).

E. Summary

Results obtained from this project have established the following. 1) Aquatic silk
proteins can be expressed and purified from bacteria in large quantities; however, fibrous
proteins may be unexpectedly difficult to analyze due to the lack of basis spectra, and those with
Cys may be susceptible to formation of multiple conformers during refolding. 2) Aquatic silk
proteins contain more Cys motifs than previously thought (Fig. 3). 3) The abundance and
conservation of glycosylation sites on some aquatic silk proteins suggest such post-translational
modification merits study before biotechnological applications are considered. The same may
apply to silks in general; spider silk is glycosylated too.

6.  PUBLICATIONS  AND  TECHNICAL  REPORTS 

P1. Case, S.T., Cox, C., Bell, W.C., Hoffman, R.T., Martin, J. and Hamilton, R.
Extraordinary conservation of cysteines among homologous Chironomus silk proteins sp185 and
sp220. J. Mol. Evol. 44:452-462 (1997).

P2. Berezikov, E., Blinov, A.G., Scherbik, S., Cox, C.K. and Case, S.T. Structure and
polymorphism of the Chironomus thummi gene encoding special lobe-specific silk protein,
ssp160. Gene 223:347-354 (1998).


Steven  T.  Case 


P3. Case, S.T. and Thornton, J.R. High molecular mass complexes of aquatic silk proteins.
Int. J. Biol. Macromol. (in press, 1999).

P4. Cox, C.K., Anido, A.E. and Case, S.T. Efficient transposon-facilitated DNA
sequencing of large target DNAs. (submitted)

P5. Case, S.T., Cox, C.K. and Anido, A.E. The last Balbiani ring: BR4 in Chironomus
pallidivittatus encodes ssp160, a special lobe-specific 160-kDa silk protein. (submitted)

P6. Cox, C.K., Anido, A.E. and Case, S.T. Polymorphic alleles of the Chironomus
pallidivittatus gene encoding ssp160. (in preparation)

P7. Anido, A.E., Cox, C.K. and Case, S.T. A molecular explanation for the fate of BR4
in Chironomus tentans. (in preparation)

P8. Case, S.T., Goel, A., Cox, C.K., Huff, E. and Donhardt, A. Novel cysteine-containing
motif in Chironomus tentans silk protein, sp195. (in preparation)

Anido, A.E. (1996) “Use of a Novel Plasmid for Transposon-Mediated DNA Sequencing”,
Honors Thesis, Millsaps College, Jackson, MS.

Donhardt,  A.M.  (1998)  “cDNA  for  a  195-kDa  Silk  Protein  from  Larval  Salivary  Glands  of 
Chironomus  tentans”.  Honors  Thesis,  Mississippi  College,  Clinton,  MS. 

7.  PARTICIPATING  SCIENTIFIC  PERSONNEL 

Jennifer  R.  Thornton, 

Sudha  Govindichar 
Carol  K.  Cox 

Undergraduate  AASERT  Awardees 
Amy  Donhardt, 

Aimee  E.  Anido, 

Elizabeth  Huff, 

Andrew  O’Dell, 

8.  REPORT  OF  INVENTIONS  None 

9.  BIBLIOGRAPHY 

1. Anfinsen, C.B. (1973) Science 181:223-230

2. Rothwarf, D.M. and Scheraga, H.A. (1993) Biochemistry 32:2671-2679

3. Creighton, T.E. and Goldenberg, D.P. (1984) J. Mol. Biol. 179:497-526

4. Weissman, J.S. and Kim, P.S. (1991) Science 253:1386-1393




5. Fibrous Protein Structure (J.M. Squire and P.J. Vibert, eds.), Academic Press,
NY, 1987, pp. 1-535

6. Biomolecular Materials, Materials Research Society Symposium Proceedings,
Pittsburgh, PA, 1993, 232:1-288

7. Case, S.T., Powers, J.M., Hamilton, R. and Burton, M.J. (1994) in Silk Polymers:
Materials Science and Biotechnology (D. Kaplan et al., eds.), American Chemical
Society Symposium 544, Washington, pp. 91-97

8. Case, S.T. and Wieslander, L. (1992) Res. Prob. Cell Diff. 19:187-226

9. Smith, S.V., Correia, J.J. and Case, S.T. (1995) Prot. Sci. 4:945-954

10. Case, S.T. and Thornton, J.R. High molecular mass complexes of aquatic silk
proteins. Int. J. Biol. Macromol. (in press, 1999)

11. Hoffman, R.T., Schmidt, E.R. and Case, S.T. (1996) J. Biol. Chem. 271:9809-9815

12. Dreesen, T.D., Bower, J.R. and Case, S.T. (1985) J. Biol. Chem. 260:11824-11830

10. APPENDIXES

Attached  are  3  figures. 




Figure 1. Disulfide Linkages in rCAS. Disulfide bonds in rCAS: intramolecular (monomer) and
intermolecular (dimer and multimer). Panel labels: MONO, MULTI.


Figure  2.  Dissociation  of  aquatic  silk  proteins. 
Figure 3. Cys-containing motifs in aquatic silk proteins.